NSF PAR Search | NSF Public Access Repository

Multivariate genome-wide association analysis by iterative hard thresholding

https://doi.org/10.1093/bioinformatics/btad193

Chu, Benjamin B; Ko, Seyoon; Zhou, Jin J; Jensen, Aubrey; Zhou, Hua; Sinsheimer, Janet S; Lange, Kenneth (April 2023, Bioinformatics)
Marschall, Tobias (Ed.)
Abstract Motivation: In a genome-wide association study, analyzing multiple correlated traits simultaneously is potentially superior to analyzing the traits one by one. Standard methods for multivariate genome-wide association study operate marker-by-marker and are computationally intensive. Results: We present a sparsity constrained regression algorithm for multivariate genome-wide association study based on iterative hard thresholding and implement it in a convenient Julia package MendelIHT.jl. In simulation studies with up to 100 quantitative traits, iterative hard thresholding exhibits similar true positive rates, smaller false positive rates, and faster execution times than GEMMA’s linear mixed models and mv-PLINK’s canonical correlation analysis. On UK Biobank data with 470 228 variants, MendelIHT completed a three-trait joint analysis (n=185 656) in 20 h and an 18-trait joint analysis (n=104 264) in 53 h with an 80 GB memory footprint. In short, MendelIHT enables geneticists to fit a single regression model that simultaneously considers the effect of all SNPs and dozens of traits. Availability and implementation: Software, documentation, and scripts to reproduce our results are available from https://github.com/OpenMendel/MendelIHT.jl.
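To make the term "iterative hard thresholding" concrete, here is a minimal single-trait sketch in plain Python. It is not MendelIHT.jl's implementation, which is written in Julia and handles multiple traits. The function name `iht`, the fixed step size, and the toy design matrix are all assumptions chosen for illustration. Each iteration takes a gradient step on the least-squares loss and then keeps only the `k` largest coefficients in absolute value.

```python
def iht(X, y, k, n_iter=100):
    """Sparse least squares by iterative hard thresholding (single trait).

    X is a list of rows, y a list of responses, k the number of
    coefficients allowed to be non-zero. Illustrative sketch only.
    """
    n, p = len(X), len(X[0])
    # Conservative fixed step (an assumption of this sketch):
    # 1 / ||X||_F^2 is never larger than 1 / ||X||_2^2.
    step = 1.0 / sum(v * v for row in X for v in row)
    beta = [0.0] * p
    for _ in range(n_iter):
        # Residual r = y - X beta
        resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p))
                 for i in range(n)]
        # Gradient step on 0.5 * ||y - X beta||^2: beta + step * X^T r
        cand = [beta[j] + step * sum(X[i][j] * resid[i] for i in range(n))
                for j in range(p)]
        # Hard thresholding: keep the k largest entries, zero the rest
        order = sorted(range(p), key=lambda j: abs(cand[j]), reverse=True)
        keep = set(order[:k])
        beta = [cand[j] if j in keep else 0.0 for j in range(p)]
    return beta


# Toy design with orthogonal columns (hypothetical data, not from the paper)
X = [[1, 1, 1, 1],
     [1, -1, 1, -1],
     [1, 1, -1, -1],
     [1, -1, -1, 1]]
true_beta = [0.0, 2.0, 0.0, -3.0]
y = [sum(X[i][j] * true_beta[j] for j in range(4)) for i in range(4)]

beta_hat = iht(X, y, k=2)
print(beta_hat)
```

On this toy problem the estimate settles on the two truly non-zero coefficients. Real genotype matrices have correlated columns and far more variants than samples, so practical implementations choose the step size and sparsity level much more carefully.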
Full Text Available
JBrowse Jupyter: a Python interface to JBrowse 2

https://doi.org/10.1093/bioinformatics/btad032

De Jesus Martinez, Teresa; Hershberg, Elliot A; Guo, Emma; Stevens, Garrett J; Diesh, Colin; Xie, Peter; Bridge, Caroline; Cain, Scott; Haw, Robin; Buels, Robert M; et al (January 2023, Bioinformatics)
Marschall, Tobias (Ed.)
Abstract Motivation: JBrowse Jupyter is a package that aims to close the gap between Python programming and genomic visualization. Web-based genome browsers are routinely used for publishing and inspecting genome annotations. Historically they have been deployed at the end of bioinformatics pipelines, typically decoupled from the analysis itself. However, emerging technologies such as Jupyter notebooks enable a more rapid iterative cycle of development, analysis and visualization. Results: We have developed a package that provides a Python interface to JBrowse 2’s suite of embeddable components, including the primary Linear Genome View. The package enables users to quickly set up, launch and customize JBrowse views from Jupyter notebooks. In addition, users can share their data via Google’s Colab notebooks, providing reproducible interactive views. Availability and implementation: JBrowse Jupyter is released under the Apache License and is available for download on PyPI. Source code and demos are available on GitHub at https://github.com/GMOD/jbrowse-jupyter.
Full Text Available
CONSTAX2: improved taxonomic classification of environmental DNA markers

https://doi.org/10.1093/bioinformatics/btab347

Liber, Julian A; Bonito, Gregory; Benucci, Gian Maria (May 2021, Bioinformatics)
Marschall, Tobias (Ed.)
Abstract Summary: CONSTAX (the CONSensus TAXonomy classifier) was developed for accurate and reproducible taxonomic annotation of fungal rDNA amplicon sequences and is based upon a consensus approach of RDP, SINTAX and UTAX algorithms. CONSTAX2 extends these features to classify prokaryotes as well as eukaryotes and incorporates BLAST-based classifiers to reduce classification errors. Additionally, CONSTAX2 implements a conda-installable command-line tool with improved classification metrics, faster training, multithreading support, capacity to incorporate external taxonomic databases and new isolate matching and high-level taxonomy tools, replete with documentation and example tutorials. Availability and implementation: CONSTAX2 is available at https://github.com/liberjul/CONSTAXv2, and is packaged for Linux and MacOS from Bioconda with use under the MIT License. A tutorial and documentation are available at https://constax.readthedocs.io/en/latest/. Data and scripts associated with the manuscript are available at https://github.com/liberjul/CONSTAXv2_ms_code. Supplementary information: Supplementary data are available at Bioinformatics online.
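The idea of a consensus across several classifiers can be illustrated with a simple majority vote at a single taxonomic rank. This is only a sketch under stated assumptions: the function `consensus_taxon` is hypothetical, and the strict-majority rule used here is not taken from CONSTAX2, whose actual consensus and tie-breaking rules are described in its documentation.

```python
from collections import Counter


def consensus_taxon(calls):
    """Return the label a strict majority of classifiers agree on, else None.

    `calls` holds one label per classifier at a single rank; None means
    that classifier made no assignment. Hypothetical rule for illustration.
    """
    votes = Counter(c for c in calls if c is not None)
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    # Require more than half of all classifiers, counting abstentions
    return label if count > len(calls) / 2 else None


# Three hypothetical classifier outputs at the phylum rank
agreed = consensus_taxon(["Ascomycota", "Ascomycota", "Basidiomycota"])
split = consensus_taxon(["Ascomycota", "Basidiomycota", None])
print(agreed, split)
```

Counting abstentions in the denominator makes the rule conservative: two classifiers that disagree, or a single confident classifier, do not produce an assignment.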
Full Text Available